Search results: all records where Editors contains "Espinal, X"


  1. De_Vita, R; Espinal, X; Laycock, P; Shadura, O (Ed.)
    Differentiable programming could open even more doors in HEP analysis and computing to Artificial Intelligence/Machine Learning. The most common current uses of AI/ML in HEP are deep learning networks, which provide sophisticated ways of separating signal from background, classifying physics, and so on. This is only one part of a full analysis: normally, skims are made to reduce dataset sizes by applying selection cuts, further selection cuts are applied, perhaps new quantities are calculated, and all of that is fed to a deep learning network. Only the deep learning stage is optimized using the AI/ML gradient descent technique. Differentiable programming offers a way to optimize the full chain, including the selection cuts that occur during skimming. This contribution investigates applying selection cuts in front of a simple neural network, using differentiable programming techniques to optimize the complete chain on toy data. Several well-known problems must be solved: for example, selection cuts are not differentiable, and the interaction of a selection cut and a network during training is not well understood. This investigation was motivated by the goal of automating reduced dataset skims and sizes during analysis: HL-LHC analyses have potentially multi-TB dataset sizes, and an automated way of reducing those sizes while understanding the trade-offs would help the analyser judge between time, resource usage, and physics accuracy. This contribution explores the techniques for applying a selection cut that are compatible with differentiable programming, and how to work around issues when such a cut is bolted onto a neural network. Code is available.
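    A common way to make a selection cut differentiable is to replace the step function with a sigmoid. The sketch below (plain Python, illustrative only; the paper's actual relaxation may differ) shows how a soft cut agrees with a hard cut away from the threshold while retaining a nonzero gradient, so the threshold itself can be learned by gradient descent.

```python
import math

def hard_cut(x, threshold):
    # Non-differentiable step: keep the event only if x exceeds the threshold.
    return 1.0 if x > threshold else 0.0

def soft_cut(x, threshold, steepness=10.0):
    # Sigmoid relaxation: approaches the hard cut as steepness grows,
    # but has a nonzero gradient everywhere, so the threshold can be
    # optimized jointly with the network weights.
    return 1.0 / (1.0 + math.exp(-steepness * (x - threshold)))

# Far above the threshold, the soft cut agrees with the hard cut.
print(round(soft_cut(5.0, 1.0), 3))
```

    In practice the steepness can be annealed during training so the soft cut converges toward the hard cut the skim will actually apply.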
  2. De_Vita, R; Espinal, X; Laycock, P; Shadura, O (Ed.)
    The IceCube Neutrino Observatory is a cubic-kilometer neutrino telescope located at the geographic South Pole. To accurately and promptly reconstruct the arrival direction of candidate neutrino events for Multi-Messenger Astrophysics use cases, IceCube employs Skymap Scanner workflows managed by the SkyDriver service. The Skymap Scanner performs maximum-likelihood tests on individual pixels generated by the Hierarchical Equal Area isoLatitude Pixelation (HEALPix) algorithm. Each test is computationally independent, which allows for massive parallelization. This workload is distributed using the Event Workflow Management System (EWMS), a message-based workflow management system designed to scale to trillions of pixels per day. SkyDriver orchestrates multiple distinct Skymap Scanner workflows behind a REST interface, providing an easy-to-use reconstruction service for real-time candidate, cataloged, and simulated events. Here, we outline the design of the SkyDriver service and the initial development of EWMS.
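    The per-pixel independence that enables this parallelization follows from the HEALPix scheme itself: the sphere is split into 12 base pixels, each subdivided into nside × nside equal-area pixels. A minimal sketch of the standard pixel-count and pixel-area formulas (not SkyDriver code):

```python
import math

def healpix_npix(nside):
    # HEALPix divides the sphere into 12 base pixels, each subdivided
    # into nside x nside equal-area pixels.
    return 12 * nside * nside

def pixel_area_sr(nside):
    # All HEALPix pixels at a given nside have equal area: 4*pi sr / npix.
    return 4.0 * math.pi / healpix_npix(nside)

# Doubling nside quadruples the pixel count (and the number of
# independent likelihood tests to distribute).
print(healpix_npix(64))
```

    Scanning at progressively higher nside refines the localization, and every pixel's likelihood test can be dispatched as an independent task.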
  3. De_Vita, R; Espinal, X; Laycock, P; Shadura, O (Ed.)
    The Big Science projects typical of multi-institute particle-physics collaborations generate unique needs for member management, including paper authorship tracking, shift assignments, subscription to mailing lists, and access to third-party applications such as GitHub and Slack. For smaller collaborations of under 200 people, often no facility for centralized member management is available, and these needs are usually handled manually by long-term members, even though the management becomes untenable as collaborations grow. To automate many of these tasks for the expanding XENON collaboration, we developed the XENONnT User Management Website, a web application built with Node.js and MongoDB that stores and updates data on collaboration members. We found that web frameworks are now so mature and approachable that a student can develop a good system to meet the unique needs of the collaboration. The application allows shifts to be scheduled so that members can coordinate between institutes. Management of third-party applications is implemented using REST API integration. The XENONnT User Management Website is open source and is a showcase of the quick implementation of a utility application using a web framework, demonstrating the value of web-based approaches for solving specific problems and aiding the logistics of running Big Science collaborations.
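    The core of such a system is a member record that various services (mailing lists, shift rosters, third-party accounts) read from and write to. The sketch below is purely illustrative; the field names and classes are hypothetical and not the XENONnT application's actual schema or code (which is Node.js/MongoDB, not Python).

```python
from dataclasses import dataclass, field

@dataclass
class Member:
    # Hypothetical member record; fields are illustrative only.
    name: str
    institute: str
    mailing_lists: set = field(default_factory=set)

class MemberRegistry:
    # Toy in-memory stand-in for the database-backed member store.
    def __init__(self):
        self.members = {}

    def add(self, member):
        self.members[member.name] = member

    def subscribe(self, name, mailing_list):
        # In a real deployment this step would also call the mailing-list
        # provider's REST API; here we only update the local record.
        self.members[name].mailing_lists.add(mailing_list)

registry = MemberRegistry()
registry.add(Member("A. Example", "Example University"))
registry.subscribe("A. Example", "analysis-list")
print(sorted(registry.members["A. Example"].mailing_lists))
```

    Keeping a single authoritative record and pushing changes out to third-party services via their REST APIs is what removes the manual bookkeeping burden described above.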
  4. De_Vita, R; Espinal, X; Laycock, P; Shadura, O (Ed.)
    Effective metadata management is a consistent challenge faced by many scientific experiments. These challenges are magnified by the evolving needs of the experiment, the intricacies of seamlessly integrating a new system with existing analytical frameworks, and the crucial mandate to maintain database integrity. In this work we present the various challenges faced by experiments that produce a large amount of metadata and describe the solution used by the XENON experiment for metadata management. 
  5. De_Vita, R; Espinal, X; Laycock, P; Shadura, O (Ed.)
    Providing computing training to the next generation of physicists is the principal driver for a biannual multi-day training workshop hosted by the DUNE Computing Consortium. Materials are cast in the Software Carpentry template, and topics have included storage space, data management, LArSoft, and grid job submission and monitoring. Moreover, experts provide extended breakout sessions to demonstrate the fundamentals of the unique software used in HEP analysis. Each session uses live documents for real-time correspondence and is captured on Zoom; afterwards, the videos are embedded on the corresponding web pages for review. Because the materials live in a GitHub repository, shared editing of the learning modules is straightforward, and the repository provides a trusted framework to extend to other training topics in the future. An overview of the tutorials and the machinery used, along with survey statistics and lessons learned, is presented.
  6. De_Vita, R; Espinal, X; Laycock, P; Shadura, O (Ed.)
    Predicting the performance of various infrastructure design options in complex federated infrastructures with computing sites distributed over a wide area network that support a plethora of users and workflows, such as the Worldwide LHC Computing Grid (WLCG), is not trivial. Due to the complexity and size of these infrastructures, it is not feasible to deploy experimental test-beds at large scale merely to compare and evaluate alternate designs. An alternative is to study the behaviours of these systems using simulation. This approach has been used successfully in the past to identify efficient and practical infrastructure designs for High Energy Physics (HEP); a prominent example is the Monarc simulation framework, which was used to study the initial structure of the WLCG. New simulation capabilities are needed to model large-scale heterogeneous computing systems with complex networks, data access, and caching patterns. We outline a modern tool, based on the SimGrid and WRENCH simulation frameworks, for simulating HEP workloads that execute on distributed computing infrastructures, and present studies of its accuracy and scalability using HEP as a case study. Hypothetical adjustments to prevailing computing architectures in HEP are studied, providing insights into the dynamics of part of the WLCG and identifying candidates for improvement.
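    To illustrate the idea of evaluating design options in simulation rather than on a test-bed, here is a deliberately tiny discrete-event sketch: greedy list scheduling of jobs across sites of different speeds. It is a toy stand-in, not SimGrid or WRENCH, which model networks, storage, and caching in far greater detail.

```python
import heapq

def simulate(jobs, site_speeds):
    # Greedy list scheduling: each job goes to the site that frees up first.
    # Returns the makespan (time at which the last job finishes).
    finish_times = [(0.0, site) for site in range(len(site_speeds))]
    heapq.heapify(finish_times)
    makespan = 0.0
    for work in jobs:
        free_at, site = heapq.heappop(finish_times)
        done = free_at + work / site_speeds[site]
        makespan = max(makespan, done)
        heapq.heappush(finish_times, (done, site))
    return makespan

# Four equal jobs on two sites, one twice as fast as the other:
# comparing site_speeds vectors lets us "test" designs without hardware.
print(simulate([10, 10, 10, 10], [1.0, 2.0]))
```

    Real platform simulators replace the `work / speed` line with calibrated models of CPU, network, and I/O contention, which is where the accuracy and scalability studies mentioned above come in.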
  7. De_Vita, R; Espinal, X; Laycock, P; Shadura, O (Ed.)
    The efficiency of high energy physics workflows relies on the ability to rapidly transfer data among the sites where the data is processed and analyzed. The best data transfer tools should provide a simple and reliable solution for local, regional, national, and in some cases intercontinental data transfers. This work outlines the results of data transfer tool tests in 100 Gbps testbeds, both internal and external (with simulated latency and packet loss), compares the results among existing solutions, and treats the tuning parameters and methods that help optimize transfer rates. Many tools have been developed to facilitate data transfers over wide area networks, but few studies have examined the tools' requirements, use cases, and reliability through comparative measurements. Here, we evaluated a variety of high-performance data transfer tools used today in the LHC and other scientific communities, such as FDT, WDT, and NDN, in different environments. These tests reproduce real-world data transfer scenarios to analyse each tool's strengths and weaknesses, including fault tolerance under packet loss. By comparing the tools in a controlled environment, we shed light on their relative reliability and usability for academia and industry. This work also highlights, for several cases, the best tuning parameters for WAN and LAN transfers for maximum performance.
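    One tuning parameter that dominates WAN performance is the socket buffer size, which must cover the bandwidth-delay product (BDP) for a stream to fill the pipe. A minimal calculation (standard networking arithmetic, not taken from the tests above):

```python
def bdp_bytes(bandwidth_gbps, rtt_ms):
    # Bandwidth-delay product: the amount of data "in flight" on the path,
    # and hence the minimum TCP buffer needed to keep one stream at full
    # rate over a long fat network.
    return int(bandwidth_gbps * 1e9 / 8 * rtt_ms / 1e3)

# A 100 Gbps path with 100 ms round-trip time needs about 1.25 GB of
# buffer per stream, which is why WAN transfers lean on kernel tuning
# and parallel streams while LAN transfers rarely need either.
print(bdp_bytes(100, 100))
```

    Packet loss compounds this: standard TCP congestion control recovers slowly on high-BDP paths, which is one reason the fault-tolerance behaviour of each tool under loss was measured.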
  8. De_Vita, R; Espinal, X; Laycock, P; Shadura, O (Ed.)
    The Large Hadron Collider (LHC) experiments distribute data by leveraging a diverse array of National Research and Education Networks (NRENs), where experiment data management systems treat networks as a “black box” resource. After the High Luminosity upgrade, the Compact Muon Solenoid (CMS) experiment alone will produce roughly 0.5 exabytes of data per year. NREN networks are a critical part of the success of CMS and the other LHC experiments. However, during data movement, NRENs are unaware of data priorities, importance, or quality-of-service needs, which makes it challenging for operators to coordinate the movement of data and to achieve predictable data flows across multi-domain networks. The overarching goal of SENSE (the Software-defined network for End-to-end Networked Science at Exascale) is to enable national laboratories and universities to request and provision end-to-end intelligent network services for their application workflows, leveraging SDN (Software-Defined Networking) capabilities. This work aims to allow the LHC experiments and Rucio, the data management software used by the CMS experiment, to allocate and prioritize certain data transfers over the wide area network. In this paper, we present the current progress of integrating SENSE, a multi-domain end-to-end SDN orchestrator with QoS (Quality of Service) capabilities, with Rucio.
  9. De_Vita, R; Espinal, X; Laycock, P; Shadura, O (Ed.)
    The large data volumes expected from the High Luminosity LHC (HL-LHC) present challenges to existing paradigms and facilities for end-user data analysis. Modern cyberinfrastructure tools provide a diverse set of services that can be composed into a system giving physicists straightforward access to large computing resources with low barriers to entry. The Coffea-Casa analysis facility (AF) provides an environment for end users that enables the execution of increasingly complex analyses, such as those demonstrated by the Analysis Grand Challenge (AGC), and captures the features physicists will need for the HL-LHC. We describe the development progress of the Coffea-Casa facility, featuring its modularity and demonstrating the ability to port and customize the facility software stack to other locations. The facility also supports batch systems while staying Kubernetes-native. We present the evolved architecture of the facility, including the integration of advanced data delivery services (e.g. ServiceX) and the provision of data caching services (e.g. XCache) to end users, and we highlight the composability of modern cyberinfrastructure tools. To enable machine learning pipelines at Coffea-Casa analysis facilities, a set of industry ML solutions adapted for HEP columnar analysis was integrated on top of the existing facility services. These services also give user workflows transparent access to GPUs available at the facility via inference servers, using Kubernetes as the enabling technology.
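    The columnar style at the heart of coffea-based analysis operates on whole columns of event data at once rather than looping event by event. A minimal sketch of the idea in plain Python (coffea itself applies the same pattern with array libraries over much larger, jagged datasets; the cut values here are arbitrary):

```python
# Toy columns of one per-event quantity each.
pt  = [12.0, 45.0, 33.0, 8.0, 60.0]   # transverse momentum, GeV
eta = [0.5, 2.1, 1.0, 0.2, 2.6]       # pseudorapidity

# Columnar selection: build a boolean mask over entire columns,
# then apply it, instead of an explicit per-event loop with branches.
mask = [(p > 20.0) and (abs(e) < 2.4) for p, e in zip(pt, eta)]
selected_pt = [p for p, keep in zip(pt, mask) if keep]
print(selected_pt)
```

    Expressing cuts as column-wise masks is what lets the facility scale the same analysis code out over distributed workers and hand vectorizable work to accelerators.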
  10. De_Vita, R; Espinal, X; Laycock, P; Shadura, O (Ed.)
    The OSG-operated Open Science Pool (OSPool) is an HTCondor-based virtual cluster that aggregates resources from compute clusters provided by several organizations. Most of the resources are not owned by OSG, so demand-based dynamic provisioning is important for maximizing usage without incurring excessive waste. OSG has long relied on GlideinWMS for most of its resource provisioning needs, but that approach is limited to resources that provide a Grid-compliant Compute Entrypoint. To work around this limitation, the OSG Software Team developed a glidein container that resource providers can use to contribute directly to the OSPool. The problem with that approach is that it is not demand-driven, relegating it to backfill scenarios only. To address this limitation, a demand-driven direct provisioner of Kubernetes resources has been developed and successfully used on the NRP. The setup still relies on the OSG-maintained backfill container image but automates the provisioning matchmaking and the successive requests. The provisioner has also been extended to support Lancium, a green computing cloud provider with a Kubernetes-like proprietary interface. The provisioner logic has been intentionally kept very simple, making this extension a low-cost project. Both NRP and Lancium resources have been provisioned exclusively through this mechanism for many months.
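    The "intentionally very simple" provisioner logic can be pictured as a sizing rule evaluated each cycle: request enough new glideins to cover idle demand, up to a pool cap. This is a hedged sketch of that shape only; the function name, parameters, and cap semantics are illustrative, not the actual OSG provisioner's code.

```python
def glideins_to_request(idle_jobs, running_glideins, max_glideins):
    # Demand-driven sizing rule (illustrative): want one glidein per idle
    # job, capped by the pool limit; request only the shortfall, and never
    # a negative amount (already-running glideins are left alone).
    wanted = min(idle_jobs, max_glideins)
    return max(0, wanted - running_glideins)

# 50 idle jobs, 20 glideins already running, pool capped at 40:
print(glideins_to_request(idle_jobs=50, running_glideins=20, max_glideins=40))
```

    Keeping the rule this simple is what made extending the provisioner to a new backend like Lancium a low-cost project: only the submission calls change, not the sizing logic.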